Semantic Text Segmentation from Synthetic Images of Full-Text Documents
Similar resources
Clustering Full Text Documents
An index or topic hierarchy of full-text documents can organize a domain and speed information retrieval. Traditional indexes, like the Library of Congress system or Dewey Decimal system, are generated by hand, updated infrequently, and applied inconsistently. With machine learning, they can be generated automatically, updated as new documents arrive, and applied consistently. Despite the appea...
Text & Non-Text Segmentation in Colored Images
The purpose of this paper is to propose a new system for text and non-text segmentation in color images with complex backgrounds. Existing text extraction methods do not work efficiently on images with complex backgrounds. Variation in style and color, together with a complex image background, makes locating and reading text in images challenging. Here the approach used is based...
Text Segmentation from Bangla Land Map Images
Text segmentation from land map images is a non-trivial task as map components are interleaved and overlapped in a complex spatial form. The characters in a word in most of the Indic languages, including Bangla (the 6th most spoken language in the world), are connected through a headline (“matra” or “shirorekha”), which makes the corresponding word a single component. It has been observed that t...
Text Region Segmentation From Heterogeneous Images
Text in images contains useful information that can be used to fully understand the images. This paper proposes a unified method to segment text regions from images such as scene text images, caption text images, and document images using the Contourlet transform. Contourlets not only possess the main features of wavelets (namely, multiscale and time-frequency localization), but also offer a high degree of ...
Clustering multilingual documents by estimating text-to-text semantic relatedness
This thesis is about multilingual document clustering through estimating semantic relatedness between multilingual texts. Specifically, we focus on the task of clustering multilingual documents with very limited or no supervisory information. We present two approaches to address the problem: a comparable-corpora-based approach and a web-search-based approach. Our first approach derives pairwi...
Journal
Journal title: Труды СПИИРАН
Year: 2019
ISSN: 2078-9599,2078-9181
DOI: 10.15622/sp.2019.18.6.1381-1406